169 research outputs found
Introducing Geometry in Active Learning for Image Segmentation
We propose an Active Learning approach to training a segmentation classifier
that exploits geometric priors to streamline the annotation process in 3D image
volumes. To this end, we use these priors not only to select voxels most in
need of annotation but also to guarantee that they lie on a 2D planar patch, which
makes it much easier to annotate than if they were randomly distributed in the
volume. A simplified version of this approach is effective in natural 2D
images. We evaluated our approach on Electron Microscopy and Magnetic Resonance
image volumes, as well as on natural images. Comparing our approach against
several accepted baselines demonstrates a marked performance increase.
Learning Active Learning from Data
In this paper, we suggest a novel data-driven approach to active learning
(AL). The key idea is to train a regressor that predicts the expected error
reduction for a candidate sample in a particular learning state. By formulating
the query selection procedure as a regression problem we are not restricted to
working with existing AL heuristics; instead, we learn strategies based on
experience from previous AL outcomes. We show that a strategy can be learnt
either from simple synthetic 2D datasets or from a subset of domain-specific
data. Our method yields strategies that work well on real data from a wide
range of domains.
A Positive/Unlabeled Approach for the Segmentation of Medical Sequences using Point-Wise Supervision
The ability to quickly annotate medical imaging data plays a critical role in training deep learning frameworks for segmentation. Doing so for image volumes or video sequences is even more pressing, as annotating these is particularly burdensome. To alleviate this problem, this work proposes a new method to efficiently segment medical imaging volumes or videos using point-wise annotations only. This allows annotations to be collected extremely quickly and remains applicable to numerous segmentation tasks. Our approach trains a deep learning model on sparse point-wise annotations with an appropriate Positive/Unlabeled objective function. While most methods of this kind assume that the proportion of positive samples in the data is known a priori, we introduce a novel self-supervised method to estimate this prior efficiently by combining a Bayesian estimation framework and new stopping criteria. Our method iteratively estimates appropriate class priors and yields high segmentation quality for a variety of object types and imaging modalities. In addition, we use a spatio-temporal tracking framework to regularize our predictions over the complete data volume. We show experimentally that our approach outperforms state-of-the-art methods tailored to the same problem.
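The abstract does not say which Positive/Unlabeled objective is used. As a minimal sketch, assuming the widely used non-negative PU risk estimator with a sigmoid surrogate loss, the objective could be computed from classifier scores as below. The function names and the fixed `prior` argument are illustrative assumptions; the paper estimates the class prior iteratively rather than taking it as given.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss: small when sign(score) agrees with label (+1 or -1)."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk (an assumed choice, not confirmed by the abstract).

    pos_scores: classifier scores at the annotated (positive) points
    unl_scores: classifier scores at unlabeled locations
    prior:      assumed proportion of positives among the unlabeled data
    """
    risk_pos_as_pos = mean(sigmoid_loss(s, +1) for s in pos_scores)
    risk_pos_as_neg = mean(sigmoid_loss(s, -1) for s in pos_scores)
    risk_unl_as_neg = mean(sigmoid_loss(s, -1) for s in unl_scores)
    # The unlabeled set mixes positives and negatives, so the contribution of
    # the positives is subtracted; clamping at zero keeps the estimate valid.
    negative_part = risk_unl_as_neg - prior * risk_pos_as_neg
    return prior * risk_pos_as_pos + max(0.0, negative_part)
```

The clamp matters when the model fits the unlabeled data very closely: without it, the estimated risk on negatives can become negative, which tends to encourage overfitting.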
Iterative multi-path tracking for video and volume segmentation with sparse point supervision
Recent machine learning strategies for segmentation tasks have shown great
ability when trained on large pixel-wise annotated image datasets. It remains a
major challenge, however, to aggregate such datasets, as the time and monetary
cost associated with collecting extensive annotations is extremely high. This
is particularly the case for generating precise pixel-wise annotations in video
and volumetric image data. To this end, this work presents a novel framework to
produce pixel-wise segmentations using minimal supervision. Our method relies
on 2D point supervision, whereby a single 2D location within an object of
interest is provided on each image of the data. Our method then estimates the
object appearance by learning object-image-specific features and by using
these in a semi-supervised learning
framework. Our object model is then used in a graph-based optimization problem
that takes into account all provided locations and the image data in order to
infer the complete pixel-wise segmentation. In practice, we solve this
optimally as a tracking problem using a K-shortest path approach. Both the
object model and segmentation are then refined iteratively to further improve
the final segmentation. We show that by collecting 2D locations using a gaze
tracker, our approach can provide state-of-the-art segmentations on a range of
objects and image modalities (video and 3D volumes), and that these can then be
used to train supervised machine learning classifiers.
Multi-Environment Model Estimation for Motility Analysis of Caenorhabditis elegans
The nematode Caenorhabditis elegans is a well-known model organism used to investigate fundamental questions in biology. Motility assays of this small roundworm are designed to study the relationships between genes and behavior. Commonly, motility analysis is used to classify nematode movements and characterize them quantitatively. Over the past years, C. elegans’ motility has been studied across a wide range of environments, including crawling on substrates, swimming in fluids, and locomoting through microfluidic substrates. However, each environment often requires customized image processing tools relying on heuristic parameter tuning. In the present study, we propose a novel Multi Environment Model Estimation (MEME) framework for automated image segmentation that is versatile across various environments. The MEME platform is constructed around the concept of Mixture of Gaussian (MOG) models, where statistical models for both the background environment and the nematode appearance are explicitly learned and used to accurately segment a target nematode. Our method is designed to simplify the burden often imposed on users; here, only a single image that includes a nematode in its environment must be provided for model learning. In addition, our platform enables the extraction of nematode ‘skeletons’ for straightforward motility quantification. We test our algorithm on various locomotive environments and compare performances with an intensity-based thresholding method. Overall, MEME outperforms the threshold-based approach for the overwhelming majority of cases examined. Ultimately, MEME provides researchers with an attractive platform for C. elegans’ segmentation and ‘skeletonizing’ across a wide range of motility assays.
Learning non-linear invariants for unsupervised out-of-distribution detection
An important hurdle to overcome before machine learning models can be reliably deployed in practice is identifying when samples differ from those seen during training, as the outputs for unexpected samples are often confidently incorrect without being identifiable as such. This problem is known as out-of-distribution (OOD) detection. A popular approach for the unsupervised OOD case is to reject samples with a high Mahalanobis distance with respect to the mean features of the training data. Recent work showed that the Mahalanobis distance can be thought of as finding the training data invariants and rejecting OOD samples that violate them. A key limitation of this approach is that it captures linear relations only. Here, we present a novel method capable of identifying non-linear invariants in the data. These are learned using a reversible neural network consisting of alternating rotation and coupling layers. Results on a varied number of tasks show it to be the best method overall, achieving state-of-the-art results on some of the experiments.
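The Mahalanobis baseline mentioned in this abstract (not the proposed non-linear method) can be illustrated with a short sketch. It assumes two-dimensional feature vectors so that the covariance matrix can be inverted in closed form; the function names and the rejection threshold are hypothetical and do not come from the paper.

```python
import math


def fit_gaussian_2d(features):
    """Estimate the mean and sample covariance of 2D training features."""
    n = len(features)
    mx = sum(f[0] for f in features) / n
    my = sum(f[1] for f in features) / n
    sxx = sum((f[0] - mx) ** 2 for f in features) / (n - 1)
    syy = sum((f[1] - my) ** 2 for f in features) / (n - 1)
    sxy = sum((f[0] - mx) * (f[1] - my) for f in features) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of x from the fitted training distribution."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form d^T S^{-1} d using the closed-form 2x2 inverse.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))


def is_ood(x, mean, cov, threshold):
    """Reject a sample whose distance exceeds a user-chosen threshold (assumed)."""
    return mahalanobis_2d(x, mean, cov) > threshold
```

In practice the features would come from a trained network and have many more dimensions, in which case a general matrix inverse or a linear solve replaces the 2x2 formula.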
Active Testing for Face Detection and Localization
We provide a novel search technique which uses a hierarchical model and a mutual information gain heuristic to efficiently prune the search space when localizing faces in images. We show exponential gains in computation over traditional sliding window approaches, while keeping similar performance levels.
Logical Implications for Visual Question Answering Consistency
Despite considerable recent progress in Visual Question Answering (VQA) models, inconsistent or contradictory answers continue to cast doubt on their true reasoning capabilities. However, most proposed methods use indirect strategies or strong assumptions on pairs of questions and answers to enforce model consistency. Instead, we propose a novel strategy intended to improve model performance by directly reducing logical inconsistencies. To do this, we introduce a new consistency loss term that can be used by a wide range of VQA models and which relies on knowing the logical relation between pairs of questions and answers. While such information is typically not available in VQA datasets, we propose to infer these logical relations using a dedicated language model and to use these in our proposed consistency loss function. We conduct extensive experiments on the VQA Introspect and DME datasets and show that our method brings improvements to state-of-the-art VQA models while being robust across different architectures and settings.
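The abstract does not give the form of the consistency loss. Purely as an illustration of how a logical relation between two question-answer pairs could be turned into a penalty, the sketch below assumes the relation is an implication (if the first answer is true, the second must be too) and penalizes the model whenever it assigns more probability to the premise than to the consequence. The hinge form and all names are assumptions, not the authors' loss.

```python
def implication_penalty(p_premise, p_consequence):
    """Penalty for violating 'premise implies consequence'.

    If A implies B, then P(A) <= P(B) must hold, so any excess of
    P(A) over P(B) is treated as an inconsistency (illustrative choice).
    """
    return max(0.0, p_premise - p_consequence)


def consistency_loss(pairs):
    """Mean implication penalty over (p_premise, p_consequence) pairs."""
    if not pairs:
        return 0.0
    return sum(implication_penalty(a, b) for a, b in pairs) / len(pairs)
```

A term of this kind would typically be added, with a weighting factor, to the usual answer-classification loss.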
Mask then classify: multi-instance segmentation for surgical instruments
PURPOSE
The detection and segmentation of surgical instruments has been a vital step for many applications in minimally invasive surgical robotics. Previously, the problem was tackled from a semantic segmentation perspective, yet these methods fail to provide good segmentation maps of instrument types and do not contain any information on the instance affiliation of each pixel. We propose to overcome this limitation by using a novel instance segmentation method which first masks instruments and then classifies them into their respective type.
METHODS
We introduce a novel method for instance segmentation where a pixel-wise mask of each instance is found prior to classification. An encoder-decoder network is used to extract instrument instances, which are then separately classified using the features of the previous stages. Furthermore, we present a method to incorporate instrument priors from surgical robots.
RESULTS
Experiments are performed on the robotic instrument segmentation dataset of the 2017 endoscopic vision challenge. We perform a fourfold cross-validation and show an improvement of over 18% over the previous state of the art. Furthermore, we perform an ablation study which highlights the importance of certain design choices and observe an increase of 10% over semantic segmentation methods.
CONCLUSIONS
We have presented a novel instance segmentation method for surgical instruments which outperforms previous semantic segmentation-based methods. Our method further provides a more informative output of instance-level information, while retaining a precise segmentation mask. Finally, we have shown that robotic instrument priors can be used to further increase the performance.
- …